Introduction:

RNA interference (RNAi) is a conserved biological process in response to double-stranded
RNA (dsRNA)1. DsRNAs are processed into short interfering RNAs (siRNAs), about 22
nucleotides in length, by the RNase enzyme Dicer. The siRNAs are then incorporated into a
silencing complex called RISC (RNA-induced silencing complex), which identifies and silences
complementary messenger RNAs. The best-characterized sources of endogenous triggers
for the RNAi machinery are the microRNA (miRNA) genes2,3. Numerous studies have demonstrated
that, in animals, miRNAs are transcribed to generate long primary polyadenylylated RNAs
(pri-miRNAs)4,5. Through mechanisms not yet fully understood, the pri-miRNA is recognized and
cleaved at a specific site by the nuclear Microprocessor complex6-10 to produce a ~70-90
nucleotide microRNA precursor (pre-miRNA), which is exported to the cytoplasm11,12. Only then
is the pre-miRNA recognized by Dicer and cleaved to produce a mature microRNA. This
probably involves recognition of the 2-nucleotide 3' overhang created by Drosha to focus Dicer
cleavage at a single site ~22 nucleotides from the end of the hairpin13.

This process can be programmed experimentally to repress the expression of any
chosen gene. We have constructed shRNA libraries (shRNA-mir) that use our advanced
understanding of miRNA biogenesis. ShRNA-mirs are modeled after endogenous miRNAs;
specifically, they are carried in the backbone of the primary miR-30 microRNA14. We have
produced and sequence-verified more than 200,000 shRNAs covering almost all of the predicted
genes in the mouse and human genomes15.

Large-scale screens of small interfering RNA (siRNA) and shRNA collections have
generally adopted a one-by-one approach, interrogating phenotypes in a well-based format.
This requires both considerable infrastructure and a substantial investment for each cell line to
be screened. Alternatively, shRNA collections can be screened by assaying enrichment from
pools, but this limits the range of phenotypes that can be addressed. Our focus is identifying
essential genes or synthetically lethal genetic interactions through shRNAs that are selectively
depleted from populations. This type of screen holds promise for the discovery of novel targets
for cancer therapy and genetically validated combination therapies. We have linked a unique
60-nucleotide DNA barcode to each shRNA vector to allow us to follow the fate of shRNAs in
populations of virally transduced cells. If, for example, a particular shRNA provided resistance to
a growth-inhibitory stimulus, then the representation of its associated barcode should be
increased after treatment. If a given shRNA sensitized a population to a specific stress, then the
relative abundance of its barcode should diminish after the stress. This is measured by
hybridizing genomic PCR products containing the barcodes to custom microarrays that contain
the complement of these sequences. One can assess cellular responses to different treatments
by comparing barcode representations of cell populations expressing known shRNAs. The
development of this highly efficient RNAi library, together with the ability to screen pools of
genes, provides us with the unique opportunity to investigate the entire genome. Previously, one
such negative screen was reported; however, it tested only ~500 shRNAs in a single pool16.
We therefore sought methods that allow multiplex analysis of phenotypic outputs on a genomic
scale. To test whether such a screen can be done, we conducted a pilot screen
identifying essential genes through shRNAs that were selectively depleted from populations of
breast cancer cells.
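
The barcode readout described above can be illustrated with a minimal Python sketch. It assumes hypothetical barcode names and made-up signal values, normalizes each barcode to the total signal of its sample, and reports the log2 change between the untreated and treated populations. The pseudocount and the 2-fold threshold are illustrative choices, not parameters taken from the screen.

```python
import math

def log2_changes(before, after, pseudocount=1.0):
    """Log2 change in relative barcode abundance (after vs. before).

    Both arguments map barcode name -> signal and must share the same keys.
    """
    n = len(before)
    total_before = sum(before.values()) + pseudocount * n
    total_after = sum(after.values()) + pseudocount * n
    changes = {}
    for barcode in before:
        frac_before = (before[barcode] + pseudocount) / total_before
        frac_after = (after[barcode] + pseudocount) / total_after
        changes[barcode] = math.log2(frac_after / frac_before)
    return changes

def classify(changes, threshold=1.0):
    """Label each barcode; threshold=1.0 corresponds to a 2-fold change."""
    labels = {}
    for barcode, value in changes.items():
        if value >= threshold:
            labels[barcode] = "enriched"   # consistent with resistance
        elif value <= -threshold:
            labels[barcode] = "depleted"   # consistent with sensitization
        else:
            labels[barcode] = "unchanged"
    return labels

# Hypothetical signals for three barcodes before and after a treatment.
before = {"bc_A": 100.0, "bc_B": 100.0, "bc_C": 100.0}
after = {"bc_A": 400.0, "bc_B": 100.0, "bc_C": 25.0}
labels = classify(log2_changes(before, after))
```

Because abundances are relative, a barcode whose raw signal is unchanged (bc_B here) still shifts slightly when other barcodes expand, which is one reason a fold-change threshold is applied rather than a simple sign test.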

Velcade, the only proteasome-targeted therapeutic approved by the FDA, is currently in
Phase II clinical trials in breast cancer, though its molecular mechanism is highly disputed. We
are examining the genes responsible for conferring resistance and susceptibility to Velcade using
our complex short hairpin RNAi library, which silences specific target genes. In this
assay, resistance to chemotherapy appears as a gain of barcode representation and
increased susceptibility to chemotherapy as a loss of barcode representation in a population of
cells.

Body: 

Pooled libraries drew from our previous collections wherein shRNAs are carried in a
backbone derived from miR-30 17. Combining RNA polymerase II promoters with miR-30-based
shRNAs permits efficient suppression even with a single-copy integrant18,19. Target cell
populations were infected such that each cell contained, on average, a single integrated virus,
and each individual shRNA occupied ~1000 cells. Three parallel infections generated biological
replicate samples. Because our goal was to identify essential genes, genomic DNA was
prepared from each replicate at three time points during a simple outgrowth assay. Each shRNA
cassette contains two unique identifiers: the shRNA itself and a random 60-nucleotide barcode.
Barcode sequences were determined for the human shRNA library, and custom, multiplex-format
microarrays were prepared that contained both barcode and half-hairpin (HH) probes20.
Proviral DNA fragments encompassing both shRNAs and barcodes were amplified from
genomic DNA pools and hybridized to arrays in competition with a common reference.
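
The infection arithmetic above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: viral integration is modeled as a Poisson process, and the multiplicity of infection (MOI) values and the pool size in the example are hypothetical, not figures reported for these screens. Only the target of about 1000 cells per shRNA comes from the text.

```python
import math

def cells_required(pool_size, cells_per_shrna=1000):
    """Transduced cells needed to hold the target representation."""
    return pool_size * cells_per_shrna

def single_integrant_fraction(moi):
    """Among infected cells, fraction with exactly one integrant (Poisson model)."""
    p_zero = math.exp(-moi)
    p_one = moi * math.exp(-moi)
    return p_one / (1.0 - p_zero)

def cells_to_plate(pool_size, moi, cells_per_shrna=1000):
    """Cells to expose to virus so that enough of them become infected."""
    infected_fraction = 1.0 - math.exp(-moi)
    return math.ceil(cells_required(pool_size, cells_per_shrna) / infected_fraction)

# Hypothetical example: a 10,000-shRNA pool at an assumed MOI of 0.3.
needed = cells_required(10_000)
plated = cells_to_plate(10_000, moi=0.3)
single = single_integrant_fraction(0.3)
```

Under this model, lowering the MOI raises the share of infected cells that carry a single shRNA but increases the number of cells that must be plated, so the two quantities have to be balanced when scaling a pool.
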

We screened complex populations containing 6,000 (6K), 10,000 (10K) or 20,000 (20K)
shRNAs. We began with a pooled analysis of 6K shRNAs in MCF-10A and MDA-MB-435.
The 10K pool was introduced into MDA-MB-231, T-47D and ZR-75-1 breast cancer cell
lines. The most complex pool (20K) was introduced into MCF-10A to allow direct comparison
with previous screens of smaller complexity. In all cases, cell numbers were scaled to maintain a
representation of 1000 cells per shRNA. The quality of each screen was similar, with high
correlations between biological replicates. We assessed the consistency of the MCF-10A
screens by comparing depleted gene sets for the 6K and 20K pools. FDR thresholds were the
same for both data sets (q < 0.1), but the fold-change criterion was relaxed from 2-fold to
1.5-fold for the 20K screen so that similar numbers of candidates were compared. A set of 172
genes (P = 1.123 x 10^-9) overlapped in both data sets, despite some differences in the
protocols used to carry out each screen. This suggests that a pool of ~20K shRNAs can be
effectively screened.
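
One common way to attach a P value to an overlap between two candidate lists is a one-sided hypergeometric test, sketched below in Python. The report does not state which test produced the value quoted above, so this is an assumption, and the universe and list sizes in the example are hypothetical placeholders rather than the sizes of the actual gene sets.

```python
from math import comb

def overlap_p_value(universe, size_a, size_b, overlap):
    """P(X >= overlap) when size_b genes are drawn at random from a universe
    that contains size_a genes of interest (hypergeometric upper tail)."""
    upper = min(size_a, size_b)
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical sizes, for illustration only.
p_small_overlap = overlap_p_value(universe=5000, size_a=300, size_b=400, overlap=25)
p_large_overlap = overlap_p_value(universe=5000, size_a=300, size_b=400, overlap=60)
```

The universe should contain only genes targeted in both pools, because genes that could not have been scored in one screen would otherwise inflate the apparent significance.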

We established a rigorous data analysis pipeline for analyzing pooled shRNA screens.
Correlations between biological replicates were high but diminished at later time points, whereas
correlations between the reference channels remained unchanged. Overall, a gene was scored
as a candidate if either its barcode or shRNA probe showed greater than 2-fold change with a
false discovery rate (FDR) <10%.
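
The scoring rule above can be written out as a short Python sketch. It assumes that each probe carries a log2 ratio and a raw P value, and it uses the Benjamini-Hochberg procedure to obtain q-values; the report does not name the FDR method, so that choice and all example numbers are assumptions for illustration.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values

def is_candidate(probes, fold=2.0, fdr=0.10):
    """probes: list of (log2_ratio, q_value) pairs for one gene, for example
    its barcode probe and its half-hairpin probe. A gene scores if either
    probe exceeds the fold-change cutoff with q below the FDR cutoff."""
    cutoff = math.log2(fold)
    return any(abs(ratio) > cutoff and q < fdr for ratio, q in probes)

# Hypothetical P values for four probes.
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])

# Hypothetical gene: barcode probe passes, shRNA probe does not.
scored = is_candidate([(-1.5, 0.04), (-0.2, 0.60)])
```

Requiring only one of the two probes to pass makes the rule tolerant of a single poorly performing probe, at the cost of roughly doubling the number of tests per gene.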

Viewing this portrait of shRNA sensitivity in more detail revealed a number of pathways
and complexes that were differentially required in MCF-10A. These included epidermal growth
factor receptor (EGFR), an effect that could be reproduced pharmacologically using the EGFR
inhibitor Tarceva21. DNA methyltransferases also scored either above or close to the threshold.
In accord with these results, MCF-10A cells showed a more than 50-fold greater sensitivity to
5-aza-deoxycytidine, a methyltransferase suicide substrate22, than the other cell lines. As a final
example, numerous proteasome subunits were preferentially depleted from MCF-10A. These
cells showed the greatest sensitivity to a proteasome inhibitor, MG-132 23. Interestingly,
MDA-MB-435 showed an intermediate level of sensitivity to the drug, and this was reflected precisely
in their intermediate level of depletion of proteasomal shRNAs during the screen.

We have validated a highly scalable approach for screening shRNA libraries. Although
we used a phenotypic filter reflecting growth and survival, virtually any characteristic that allows
separation of phenotypically distinct cells can be applied. We also validated the ability of
functional shRNA screening to separate cell lines based on their genetic vulnerabilities in a
manner that reflects their already defined characteristics (e.g., immortal versus tumor, basal
versus luminal). Although one could attribute selective dependency to culture conditions in some
cases, the overwhelming concordance of the shRNAs that affect proliferation and survival across
these lines, many of which are cultured identically, strongly argues against this being a
pervasive explanation. In all, this approach enables genome-wide screens for tumor-specific
vulnerabilities to be carried out on large numbers of tumor lines. Moreover, it permits rational
searches for lesions that synergize with existing therapeutics to produce a path toward
genetically informed combination therapies. This pilot screen has allowed us to develop the
tools necessary to conduct large-scale negative selection screens using an shRNA library with up
to 20,000 hairpins. In addition, we have developed a microarray platform with the
accompanying statistical methods for analysis. This microarray platform and statistical analysis
are currently being applied to our Velcade screen, which was conducted in MDA-MB-231 breast
cancer cells.

Key Research Accomplishments:

• An RNAi screen identifying genes that are important for the proliferation and survival of five
cell lines derived from human mammary tissue.

• These studies establish a practical platform for genome-scale screening of complex
phenotypes in mammalian cells and demonstrate that RNAi can be used to expose
genotype-specific sensitivities.

Reportable Outcomes:

Publications:

Silva JM, Marran K, Parker JS, Silva J, Golding M, Schlabach MR, Elledge SJ, Hannon GJ,
Chang K. Profiling essential genes in human mammary cells by multiplex RNAi screening.
Science. 2008 Feb 1;319(5863):617-20.

Conclusions:

We have validated a highly scalable approach for screening shRNA libraries in breast cancer cells.
We can conduct screens with up to 20,000 hairpins and identify depleted genes from a complex
population. Our pilot screen identified genes that are important for the proliferation and survival of
five cell lines derived from human mammary tissue. We will use these microarray and statistical
tools to study genes that modify sensitivity to the proteasome inhibitor Velcade. This screen
was conducted in MDA-MB-231 breast cancer cells at two different dosages, allowing us to detect
genes that enhance sensitivity or increase resistance to Velcade. We have developed our
microarray platform and analysis methods to allow us to detect viable candidates. These
candidates will then be validated in vitro and in vivo.